
    Human Time-Frequency Acuity Beats the Fourier Uncertainty Principle

    The time-frequency uncertainty principle states that the product of the temporal and frequency extents of a signal cannot be smaller than 1/(4π). We study human ability to simultaneously judge the frequency and the timing of a sound. Our subjects often exceeded the uncertainty limit, sometimes by more than tenfold, mostly through remarkable timing acuity. Our results establish a lower bound for the nonlinearity and complexity of the algorithms employed by our brains in parsing transient sounds, rule out simple "linear filter" models of early auditory processing, and highlight timing acuity as a central feature in auditory object processing. Comment: 4 pages, 2 figures; Accepted at PR
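    The 1/(4π) bound can be checked numerically: a Gaussian pulse attains the Gabor limit, so the product of its RMS time spread and RMS frequency spread equals 1/(4π). A minimal sketch (the sampling rate and pulse width are arbitrary choices, not values from the study):

```python
import numpy as np

def spread(x, w):
    """RMS spread of axis x weighted by the energy density w."""
    w = w / w.sum()
    mu = (x * w).sum()
    return np.sqrt(((x - mu) ** 2 * w).sum())

# Gaussian pulse sampled finely over a wide window
fs = 1000.0
t = np.arange(-5, 5, 1 / fs)
sigma = 0.2
g = np.exp(-t**2 / (2 * sigma**2))

# time spread from |g|^2, frequency spread from |G|^2
dt = spread(t, np.abs(g) ** 2)
G = np.fft.fftshift(np.fft.fft(g))
f = np.fft.fftshift(np.fft.fftfreq(len(t), 1 / fs))
df = spread(f, np.abs(G) ** 2)

print(dt * df, 1 / (4 * np.pi))  # the Gaussian meets the bound with equality
```

Any other pulse shape makes the product strictly larger, which is what makes the reported sub-limit human performance incompatible with a single linear filter.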

    Incorporating Inductances in Tissue-Scale Models of Cardiac Electrophysiology

    In standard models of cardiac electrophysiology, including the bidomain and monodomain models, local perturbations can propagate at infinite speed. We address this unrealistic property by developing a hyperbolic bidomain model that is based on a generalization of Ohm's law with a Cattaneo-type model for the fluxes. Further, we obtain a hyperbolic monodomain model in the case that the intracellular and extracellular conductivity tensors have the same anisotropy ratio. In one spatial dimension, the hyperbolic monodomain model is equivalent to a cable model that includes axial inductances, and the relaxation times of the Cattaneo fluxes are strictly related to these inductances. A purely linear analysis shows that the inductances are negligible, but models of cardiac electrophysiology are highly nonlinear, and linear predictions may not capture the fully nonlinear dynamics. In fact, contrary to the linear analysis, we show that for simple nonlinear ionic models, an increase in conduction velocity is obtained for small and moderate values of the relaxation time. A similar behavior is also demonstrated with biophysically detailed ionic models. Using the Fenton-Karma model along with a low-order finite element spatial discretization, we numerically analyze differences between the standard monodomain model and the hyperbolic monodomain model. In a simple benchmark test, we show that the propagation of the action potential is strongly influenced by the alignment of the fibers with respect to the mesh in both the parabolic and hyperbolic models when using relatively coarse spatial discretizations. Accurate predictions of the conduction velocity require computational mesh spacings on the order of a single cardiac cell. We also compare the two formulations in the case of spiral break up and atrial fibrillation in an anatomically detailed model of the left atrium, and [...]. Comment: 20 pages, 12 figures
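    The construction can be sketched as follows (generic symbols, not necessarily the paper's notation): in the monodomain current balance, replace Ohm's law for the flux q by a Cattaneo law with relaxation time τ, then eliminate q by applying (1 + τ∂_t) to the balance:

```latex
% Monodomain current balance with flux q:
%   \chi \left( C_m v_t + I_{\mathrm{ion}}(v) \right) = -\nabla \cdot q
% Ohm's law (parabolic case):       q = -\sigma \nabla v
% Cattaneo law (hyperbolic case):   \tau q_t + q = -\sigma \nabla v
% Applying (1 + \tau \partial_t) to the balance and eliminating q gives
\tau \chi \left( C_m v_{tt} + \partial_t I_{\mathrm{ion}}(v) \right)
  + \chi \left( C_m v_t + I_{\mathrm{ion}}(v) \right)
  = \nabla \cdot \left( \sigma \nabla v \right)
```

The second-order time derivative makes the equation hyperbolic, so perturbations travel at finite speed; in 1D the relaxation time τ plays the role of an axial inductance in the equivalent cable model.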

    Pitch strength of normal and dysphonic voices

    Two sounds with the same pitch may differ in the saliency of their pitch sensation. This perceptual attribute is called “pitch strength.” The study of voice pitch strength may be important in quantifying normal and pathological voice qualities. The present study investigated how pitch strength varies across normal and dysphonic voices. A set of voices (vowel /a/) selected from the Kay Elemetrics Disordered Voice Database served as the stimuli. These stimuli demonstrated a wide range of voice quality. Ten listeners judged the pitch strength of these stimuli in an anchored magnitude estimation task. On a given trial, listeners heard three different stimuli: the first represented very low pitch strength (wide-band noise), the second was the target voice, and the third represented very high pitch strength (pure tone). Listeners estimated the pitch strength of the target voice by positioning a continuous slider labeled with values between 0 and 1, reflecting the two anchor stimuli. Results revealed that listeners can judge pitch strength reliably in dysphonic voices. Moderate to high correlations with perceptual judgments of voice quality suggest that pitch strength may contribute to voice quality judgments.
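    The anchored task can be illustrated with a short sketch that synthesizes the two anchor stimuli and clamps a slider reading to the 0–1 scale (the sample rate, tone frequency, and duration here are illustrative assumptions, not the study's values):

```python
import numpy as np

fs = 16000            # sample rate in Hz (assumed)
dur = 0.5             # stimulus duration in seconds (assumed)
t = np.arange(int(fs * dur)) / fs
rng = np.random.default_rng(0)

# low-pitch-strength anchor: wide-band Gaussian noise, peak-normalized
noise_anchor = rng.standard_normal(t.size)
noise_anchor /= np.max(np.abs(noise_anchor))

# high-pitch-strength anchor: a pure tone (frequency is an assumption)
tone_anchor = np.sin(2 * np.pi * 220.0 * t)

def scale_rating(slider_pos):
    """Clamp a raw slider position onto the 0-1 pitch-strength scale."""
    return float(min(1.0, max(0.0, slider_pos)))
```

A trial would then play `noise_anchor`, the target voice, and `tone_anchor` in sequence and record `scale_rating` of the listener's slider position.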

    Presence of 1/f noise in the temporal structure of psychoacoustic parameters of natural and urban sounds.

    1/f noise, or pink noise, which has been shown to be universal in nature, has also been observed in the temporal envelope of music, speech, and environmental sound. Moreover, the slope of the spectral density of the temporal envelope of music has been shown to correlate well with its pleasing, dull, or chaotic character. In this paper, the temporal structure of a number of instantaneous psychoacoustic parameters of environmental sound is examined in order to investigate whether a 1/f temporal structure appears in the various types of sound that are generally preferred by people in everyday life. The results show, to some extent, that different categories of environmental sounds have different temporal structure characteristics. Only some of the urban sounds considered, and birdsong, generally exhibit 1/f behavior on short to medium time scales, i.e., from 0.1 s to 10 s, in instantaneous loudness and sharpness, whereas a more chaotic variation is found in birdsong at longer time scales, i.e., 10 s–200 s. The other sound categories considered exhibit random or monotonic variations on the different time scales. In general, this study shows that a 1/f temporal structure is not necessarily present in environmental sounds that are commonly perceived as pleasant.
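    The kind of analysis described, fitting the slope of the envelope's spectral density on a log-log scale (a slope near -1 indicates 1/f behavior), can be sketched on a synthetic pink envelope; the frame interval and lengths here are arbitrary, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2**14

# synthesize a 1/f ("pink") envelope by shaping white noise in frequency
white = rng.standard_normal(n)
spec = np.fft.rfft(white)
f = np.fft.rfftfreq(n, d=0.1)     # 0.1 s frames, so 0.1 s - 10 s scales
spec[1:] /= np.sqrt(f[1:])        # power spectral density ~ 1/f
spec[0] = 0.0
env = np.fft.irfft(spec, n)

# slope of the power spectral density on a log-log scale
psd = np.abs(np.fft.rfft(env)) ** 2
slope, _ = np.polyfit(np.log(f[1:]), np.log(psd[1:]), 1)
print(slope)   # close to -1 for 1/f behavior
```

Applied to an instantaneous loudness or sharpness time series instead of the synthetic envelope, the same fit distinguishes 1/f, random (slope near 0), and more strongly monotonic structures.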

    Real-time Soundprism

    [EN] This paper presents a parallel real-time sound source separation system that decomposes an audio signal captured with a single microphone into as many audio signals as there are instruments actually playing. This approach is usually known as Soundprism. The application scenario of the system is a concert hall in which users, instead of listening to the mixed audio, want to receive the audio of just one instrument, focusing on a particular performance. The challenge is even greater since we are interested in a real-time system on handheld devices, i.e., devices characterized by both low power consumption and mobility. The results presented show that it is possible to obtain real-time performance in the tested scenarios using an ARM processor aided by a GPU, when one is present. This work has been supported by the "Ministerio de Economía y Competitividad" of Spain and FEDER under projects TEC2015-67387-C4-{1,2,3}-R. Muñoz-Montoro, AJ.; Ranilla, J.; Vera-Candeas, P.; Combarro, EF.; Alonso-Jordá, P. (2019). Real-time Soundprism. The Journal of Supercomputing 75(3):1594–1609. https://doi.org/10.1007/s11227-018-2703-0
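    The separation back-end in systems of this family is typically non-negative matrix factorization (NMF) of the magnitude spectrogram; the score-informed pipeline adds instrument models and score alignment on top. A generic KL-divergence NMF with multiplicative updates on a toy rank-2 mixture (a building-block sketch, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def nmf_kl(V, k, n_iter=200, eps=1e-9):
    """Plain KL-divergence NMF via multiplicative updates.
    V (F x T) is a non-negative spectrogram, k the number of components."""
    F, T = V.shape
    W = rng.random((F, k)) + eps   # spectral bases
    H = rng.random((k, T)) + eps   # per-frame activations
    ones = np.ones_like(V)
    for _ in range(n_iter):
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (ones @ H.T)
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones)
    return W, H

# toy mixture of two "instruments" with fixed spectra
Wtrue = np.array([[1.0, 0.0], [0.5, 0.2], [0.0, 1.0]])
Htrue = rng.random((2, 50))
V = Wtrue @ Htrue
W, H = nmf_kl(V, k=2)
print(np.abs(V - W @ H).mean())   # reconstruction error
```

A score-informed system constrains `W` to learnt instrument models and masks `H` with the aligned score, which is what lets each instrument's signal be resynthesized separately in real time.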

    Frame Theory for Signal Processing in Psychoacoustics

    This review chapter aims to strengthen the link between frame theory and signal processing tasks in psychoacoustics. On one side, the basic concepts of frame theory are presented, and some proofs are provided to explain those concepts in detail. The goal is to reveal to hearing scientists how this mathematical theory could be relevant for their research. In particular, we focus on frame theory in a filter bank approach, which is probably the most relevant viewpoint for audio signal processing. On the other side, basic psychoacoustic concepts are presented to stimulate mathematicians to apply their knowledge in this field.
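    A concrete point of contact: for an STFT/Gabor filter bank in the "painless" case (window length not exceeding the number of channels M), the frame operator is diagonal, with diagonal M·Σₙ w(t − na)², so the frame bounds A and B are just M times the minimum and maximum of the overlap-added squared window. A small numerical check (window, hop, and signal length are arbitrary choices):

```python
import numpy as np

L = 480                    # signal length, a multiple of the hop
a, M = 8, 32               # hop size and number of frequency channels
w = np.hanning(M)          # analysis window of length M

# periodic overlap-add of w^2: the diagonal of the frame operator
diag = np.zeros(L)
for n in range(0, L, a):
    for i in range(M):
        diag[(n + i) % L] += w[i] ** 2
diag *= M

A, B = diag.min(), diag.max()
print(A, B)   # A > 0 means the system is a frame; A == B, a tight frame
```

With this Hann window at 4x overlap the bounds are nearly equal, so the system is a snug frame and reconstruction with the canonical dual window is numerically well conditioned.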

    Short and Intense Tailor-Made Notched Music Training against Tinnitus: The Tinnitus Frequency Matters

    Tinnitus is one of the most common disorders in industrialized countries. Here, we developed and evaluated a short-term (5 subsequent days) and intensive (6 hours/day) tailor-made notched music training (TMNMT) for patients suffering from chronic, tonal tinnitus. We evaluated (i) the TMNMT efficacy in terms of behavioral and magnetoencephalographic outcome measures for two matched patient groups with either low (≤8 kHz, N = 10) or high (>8 kHz, N = 10) tinnitus frequencies, and (ii) the persistence of the TMNMT effects over the course of a four-week post-training phase. The results indicated that the short-term intensive TMNMT was effective in patients with tinnitus frequencies ≤8 kHz: subjective tinnitus loudness, tinnitus-related distress, and tinnitus-related auditory cortex evoked activity were significantly reduced after TMNMT completion. However, in the patients with tinnitus frequencies >8 kHz, significant changes were not observed. Interpreted in their entirety, the results also indicated that the induced changes in auditory cortex evoked activity and tinnitus loudness were not persistent, encouraging the application of the TMNMT as a longer-term training. These findings are essential in guiding the intended transfer of this neuroscientific treatment approach into routine clinical practice.
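    The notching step of TMNMT removes the frequency band centered on the individual tinnitus frequency from the music. A sketch of such a band-stop filter with SciPy, using a one-octave notch; the filter order, sample rate, and example tinnitus frequency are assumptions for illustration, not the study's parameters:

```python
import numpy as np
from scipy.signal import butter, sosfreqz

fs = 44100.0
f_t = 4000.0                            # matched tinnitus frequency (example)
lo, hi = f_t / 2**0.5, f_t * 2**0.5     # one-octave band centered on f_t

# Butterworth band-stop filter removing the octave around f_t
sos = butter(4, [lo, hi], btype="bandstop", fs=fs, output="sos")

# check attenuation at f_t versus a frequency well outside the notch
w, h = sosfreqz(sos, worN=4096, fs=fs)
gain_at_ft = np.abs(h[np.argmin(np.abs(w - f_t))])
gain_at_1k = np.abs(h[np.argmin(np.abs(w - 1000.0))])
print(gain_at_ft, gain_at_1k)
```

Filtering the training music through `sos` (e.g. with `scipy.signal.sosfilt`) yields the notched stimulus: energy at the tinnitus frequency is strongly suppressed while the rest of the spectrum passes essentially unchanged.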

    Illusory Percepts from Auditory Adaptation

    Phenomena resembling tinnitus and the Zwicker phantom tone are seen to result from an auditory gain adaptation mechanism that attempts to make full use of a fixed-capacity channel. In the case of tinnitus, the gain adaptation enhances the internal noise of a frequency band that is otherwise silent due to damage. This generates a percept of a phantom sound as a consequence of hearing loss. In the case of the Zwicker tone, a frequency band is temporarily silent during the presentation of a notched broad-band sound, resulting in a percept of a tone at the notched frequency. The model suggests a link between tinnitus and the Zwicker tone percept, in that it predicts different results for normal and tinnitus subjects due to a loss of instantaneous nonlinear compression. Listening experiments on 44 subjects show that tinnitus subjects (11 of 44) are significantly more likely to hear the Zwicker tone. This psychoacoustic experiment establishes the first empirical link between the Zwicker tone percept and tinnitus. Together with the modeling results, this supports the hypothesis that the phantom percept is a consequence of a central adaptation mechanism confronted with a degraded sensory apparatus.
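    The adaptation mechanism can be caricatured as per-band gain control that slowly renormalizes each channel's output power to a fixed target: when a band's input is silenced, its gain grows until internal noise alone fills the channel, producing a noise-driven "phantom" output. A toy sketch (all constants are illustrative, not the paper's model):

```python
import numpy as np

n_bands = 16
target = 1.0          # fixed channel capacity each band adapts toward
noise_var = 1e-3      # internal noise power present in every band
tau = 20.0            # adaptation time constant, in steps

signal_var = np.ones(n_bands)
signal_var[8] = 0.0   # band silenced by damage (tinnitus) or a notch (Zwicker)

gain = np.ones(n_bands)
for _ in range(500):
    out_power = gain**2 * (signal_var + noise_var)
    gain *= (target / out_power) ** (1.0 / tau)   # slow gain renormalization

# the silenced band's gain has grown until internal noise alone fills
# the channel, while the driven bands stay near unit gain
print(gain[8], gain[0])
```

On a slow time constant this reproduces both regimes: a permanently silent band (damage) yields a persistent phantom, and a transiently notched input yields a transient one, like the Zwicker tone.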

    Auditory-inspired morphological processing of speech spectrograms: applications in automatic speech recognition and speech enhancement

    New auditory-inspired speech processing methods are presented in this paper, combining spectral subtraction and two-dimensional non-linear filtering techniques originally conceived for image processing purposes. In particular, mathematical morphology operations, like erosion and dilation, are applied to noisy speech spectrograms using specifically designed structuring elements inspired by the masking properties of the human auditory system. This is effectively complemented with a pre-processing stage including the conventional spectral subtraction procedure and auditory filterbanks. These methods were tested in both speech enhancement and automatic speech recognition tasks. For the former, time-frequency anisotropic structuring elements over grey-scale spectrograms were found to provide better perceptual quality than isotropic ones, revealing themselves as more appropriate (under a number of perceptual quality estimation measures and several signal-to-noise ratios on the Aurora database) for retaining the structure of speech while removing background noise. For the latter, the combination of spectral subtraction and auditory-inspired morphological filtering was found to improve recognition rates on a noise-contaminated version of the Isolet database. This work has been partially supported by the Spanish Ministry of Science and Innovation CICYT Project No. TEC2008-06382/TEC.
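    The core operation, grey-scale erosion followed by dilation (a morphological opening) with an anisotropic structuring element, can be sketched with SciPy; the toy spectrogram and the structuring-element shape here are illustrative, not the paper's auditory-masking design:

```python
import numpy as np
from scipy.ndimage import grey_dilation, grey_erosion

rng = np.random.default_rng(3)

# toy "spectrogram": a steady harmonic track buried in a noise floor
spec = rng.random((64, 100)) * 0.3      # 64 frequency bins x 100 frames
spec[20, :] += 1.0                      # one persistent spectral track

# anisotropic structuring element: extended along time, not frequency,
# loosely mimicking the temporal spread of auditory masking
selem = np.ones((1, 7))

# opening = erosion then dilation: removes structures narrower in time
# than the element (isolated noise peaks) while keeping the track
opened = grey_dilation(grey_erosion(spec, footprint=selem), footprint=selem)

print(opened[20].mean(), opened[40].mean())  # track row vs. noise row
```

The anisotropy is the point: a time-elongated element preserves speech's horizontally coherent harmonic structure while suppressing temporally isolated noise, matching the paper's finding that anisotropic elements beat isotropic ones perceptually.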